Blind Source Separation with Optimal Transport Non-negative Matrix Factorization
Optimal transport as a loss for machine learning optimization problems has
recently gained a lot of attention. Building upon recent advances in
computational optimal transport, we develop an optimal transport non-negative
matrix factorization (NMF) algorithm for supervised speech blind source
separation (BSS). Optimal transport allows us to design and leverage a cost
between short-time Fourier transform (STFT) spectrogram frequencies, which
takes into account how humans perceive sound. We give empirical evidence that
using our proposed optimal transport NMF leads to perceptually better results
than Euclidean NMF, for both isolated voice reconstruction and BSS tasks.
Finally, we demonstrate how to use optimal transport for cross-domain sound
processing tasks, where frequencies represented in the input spectrograms may
be different from one spectrogram to another.
Comment: 22 pages, 7 figures, 2 additional file
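To illustrate why a transport cost between spectrogram frequencies can behave differently from a Euclidean loss, here is a minimal sketch of entropy-regularized optimal transport (Sinkhorn iterations) between two normalized spectra. It is not the paper's algorithm: the squared log-frequency cost, the bin frequencies, the regularization strength `eps`, and the function names are all assumptions chosen for illustration.

```python
import math

def sinkhorn_cost(a, b, cost, eps=0.1, n_iter=200):
    """Transport cost of the entropy-regularized plan between histograms a and b.

    a, b: non-negative lists that each sum to 1.
    cost: cost[i][j] is the price of moving mass from bin i of a to bin j of b.
    """
    n, m = len(a), len(b)
    # Gibbs kernel of the cost matrix.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    return sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical frequency bins in Hz, one octave apart.
freqs = [100.0, 200.0, 400.0, 800.0]
# Assumed cost: squared distance in octaves (log2 of the frequency ratio).
cost = [[math.log2(fi / fj) ** 2 for fj in freqs] for fi in freqs]

reference = [1.0, 0.0, 0.0, 0.0]   # all energy at 100 Hz
near = [0.0, 1.0, 0.0, 0.0]        # shifted by one octave
far = [0.0, 0.0, 0.0, 1.0]         # shifted by three octaves

d_near = sinkhorn_cost(reference, near, cost)
d_far = sinkhorn_cost(reference, far, cost)
l2_near = euclidean(reference, near)
l2_far = euclidean(reference, far)
```

In this toy setting the Euclidean distance cannot tell the small shift from the large one, while the transport cost grows with the size of the frequency shift. That is the kind of behavior a frequency-aware cost is meant to capture.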
Dual Gauss-Newton Directions for Deep Learning
Inspired by Gauss-Newton-like methods, we study the benefit of leveraging the
structure of deep learning objectives, namely, the composition of a convex loss
function and of a nonlinear network, in order to derive better direction
oracles than stochastic gradients, based on the idea of partial linearization.
In a departure from previous works, we propose to compute such direction
oracles via their dual formulation, leading to both computational benefits and
new insights. We demonstrate that the resulting oracles define descent
directions that can be used as a drop-in replacement for stochastic gradients,
in existing optimization algorithms. We empirically study the advantage of
using the dual formulation as well as the computational trade-offs involved in
the computation of such oracles.
Comment: Presented at the Duality Principles for Modern Machine Learning
Workshop at ICML 202
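As a rough illustration of computing a Gauss-Newton-like direction through a dual problem, the sketch below treats only a special case that is assumed here and not stated in the abstract: a squared loss with a ridge term `lam`. In that case the linearized subproblem has a closed-form solution, and the same direction can be obtained by solving a system in parameter space (primal) or in sample space (dual). The helper names and the small Jacobian are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def add_ridge(A, lam):
    return [[A[i][j] + (lam if i == j else 0.0) for j in range(len(A))]
            for i in range(len(A))]

def gn_primal(J, r, lam):
    """Solve (J^T J + lam I) d = -J^T r, a system of size n_params."""
    Jt = transpose(J)
    g = matvec(Jt, r)
    return solve(add_ridge(matmul(Jt, J), lam), [-gi for gi in g])

def gn_dual(J, r, lam):
    """Solve (J J^T + lam I) alpha = r, a system of size n_samples,
    then map back to parameter space with d = -J^T alpha."""
    Jt = transpose(J)
    alpha = solve(add_ridge(matmul(J, Jt), lam), r)
    return [-x for x in matvec(Jt, alpha)]

# Hypothetical Jacobian of the network outputs (2 samples, 3 parameters)
# and residuals r = prediction - target.
J = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]
r = [0.5, -1.0]
lam = 0.1

d_primal = gn_primal(J, r, lam)
d_dual = gn_dual(J, r, lam)
grad = matvec(transpose(J), r)  # gradient of 0.5 * ||r||^2 w.r.t. parameters
slope = sum(g * d for g, d in zip(grad, d_dual))
```

Both routes give the same direction, and its inner product with the gradient is negative, so it is a descent direction. The dual system is sized by the number of samples and not by the number of parameters, which hints at one possible source of computational benefit when a mini-batch is much smaller than the network.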